Overview

Dataset statistics

Number of variables: 24
Number of observations: 1000
Missing cells: 8
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 195.3 KiB
Average record size in memory: 200.0 B
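
The two derived figures in this block follow from the raw counts. A minimal sketch, assuming the missing-cell percentage is missing cells divided by (variables x observations) and the average record size is total memory divided by observations; the variable names are illustrative and not taken from the report:

```python
# Values copied from the Dataset statistics block.
n_variables = 24
n_observations = 1000
missing_cells = 8
total_size_kib = 195.3

# Share of all cells that are missing (assumed: cells = variables x observations).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average bytes per record (assumed: 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.3f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share falls below the 0.1% display threshold, which is consistent with the "< 0.1%" shown in the report.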

Variable types

Categorical: 13
Text: 3
Numeric: 7
Boolean: 1

Alerts

Operating Certificate Number is highly overall correlated with Facility ID and 1 other field (High correlation)
Facility ID is highly overall correlated with Operating Certificate Number (High correlation)
Length of Stay is highly overall correlated with Total Charges and 1 other field (High correlation)
APR DRG Code is highly overall correlated with APR MDC Code and 2 other fields (High correlation)
APR MDC Code is highly overall correlated with APR DRG Code and 1 other field (High correlation)
Total Charges is highly overall correlated with Length of Stay and 1 other field (High correlation)
Total Costs is highly overall correlated with Length of Stay and 1 other field (High correlation)
Health Service Area is highly overall correlated with Operating Certificate Number (High correlation)
Age Group is highly overall correlated with APR MDC Description (High correlation)
Type of Admission is highly overall correlated with APR DRG Code and 2 other fields (High correlation)
APR MDC Description is highly overall correlated with APR DRG Code and 3 other fields (High correlation)
APR Severity of Illness Code is highly overall correlated with APR Severity of Illness Description and 1 other field (High correlation)
APR Severity of Illness Description is highly overall correlated with APR Severity of Illness Code and 1 other field (High correlation)
APR Risk of Mortality is highly overall correlated with APR Severity of Illness Code and 1 other field (High correlation)
Emergency Department Indicator is highly overall correlated with Type of Admission (High correlation)
Patient Disposition is highly imbalanced (56.0%) (Imbalance)

Reproduction

Analysis started: 2023-06-23 16:18:11.209075
Analysis finished: 2023-06-23 16:18:37.978918
Duration: 26.77 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json

Variables

Health Service Area
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.8%
Missing: 2
Missing (%): 0.2%
Memory size: 15.6 KiB

New York City: 473
Long Island: 153
Hudson Valley: 100
Central NY: 78
Finger Lakes: 71
Other values (3): 123

Length

Max length: 14
Median length: 13
Mean length: 12.271543
Min length: 10

Characters and Unicode

Total characters: 12247
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New York City
2nd row: New York City
3rd row: New York City
4th row: Long Island
5th row: New York City

Common Values

Value             Count   Frequency (%)
New York City       473   47.3%
Long Island         153   15.3%
Hudson Valley       100   10.0%
Central NY           78   7.8%
Finger Lakes         71   7.1%
Western NY           57   5.7%
Capital/Adiron       55   5.5%
Southern Tier        11   1.1%
(Missing)             2   0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value              Count   Frequency (%)
new                  473   19.6%
york                 473   19.6%
city                 473   19.6%
long                 153   6.3%
island               153   6.3%
ny                   135   5.6%
hudson               100   4.1%
valley               100   4.1%
central               78   3.2%
finger                71   2.9%
Other values (5)     205   8.5%
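
The word counts in this table can be reproduced from the category counts reported under Common Values. A minimal sketch, assuming words are obtained by lowercasing each value and splitting on whitespace (an assumption that matches the reported numbers, for example "ny" = 78 + 57 = 135, but is not confirmed by the report itself); `word_counts` is a hypothetical helper name:

```python
from collections import Counter

# Category counts copied from the Common Values table for Health Service Area.
value_counts = {
    "New York City": 473,
    "Long Island": 153,
    "Hudson Valley": 100,
    "Central NY": 78,
    "Finger Lakes": 71,
    "Western NY": 57,
    "Capital/Adiron": 55,
    "Southern Tier": 11,
}


def word_counts(counts):
    """Count lowercased, whitespace-separated words, weighted by category count."""
    words = Counter()
    for value, count in counts.items():
        for word in value.lower().split():
            words[word] += count
    return words


words = word_counts(value_counts)
total_words = sum(words.values())

# Print the most frequent words with their share of all words.
for word, count in words.most_common(6):
    print(f"{word:<8}{count:>5}{100 * count / total_words:>7.1f}%")
```

Note that the percentages here are shares of all words, not of all rows, which is why "new" appears at 19.6% even though New York City covers 47.3% of records.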

Most occurring characters

Value               Count   Frequency (%)
(space)              1416   11.6%
e                     929   7.6%
o                     792   6.5%
r                     756   6.2%
n                     678   5.5%
t                     674   5.5%
i                     665   5.4%
N                     608   5.0%
Y                     608   5.0%
C                     606   4.9%
Other values (21)    4515   36.9%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter     8172   66.7%
Uppercase Letter     2604   21.3%
Space Separator      1416   11.6%
Other Punctuation      55   0.4%
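
The category totals above can be checked with Python's standard `unicodedata` module, which exposes the Unicode general category of each character. A minimal sketch, assuming each character is counted once per record that contains it; `category_counts` is a hypothetical helper name, and the two-letter codes are the standard abbreviations (Ll = Lowercase Letter, Lu = Uppercase Letter, Zs = Space Separator, Po = Other Punctuation):

```python
import unicodedata
from collections import Counter

# Category counts copied from the Common Values table for Health Service Area.
value_counts = {
    "New York City": 473,
    "Long Island": 153,
    "Hudson Valley": 100,
    "Central NY": 78,
    "Finger Lakes": 71,
    "Western NY": 57,
    "Capital/Adiron": 55,
    "Southern Tier": 11,
}


def category_counts(counts):
    """Tally Unicode general categories over all characters, weighted by row count."""
    categories = Counter()
    for value, count in counts.items():
        for ch in value:
            categories[unicodedata.category(ch)] += count
    return categories


categories = category_counts(value_counts)
for code, count in categories.most_common():
    print(code, count)
```

The total over all categories equals the "Total characters" figure reported for this variable.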

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
e                    929   11.4%
o                    792   9.7%
r                    756   9.3%
n                    678   8.3%
t                    674   8.2%
i                    665   8.1%
y                    573   7.0%
k                    544   6.7%
a                    512   6.3%
l                    486   5.9%
Other values (7)    1563   19.1%

Uppercase Letter
Value              Count   Frequency (%)
N                    608   23.3%
Y                    608   23.3%
C                    606   23.3%
L                    224   8.6%
I                    153   5.9%
H                    100   3.8%
V                    100   3.8%
F                     71   2.7%
W                     57   2.2%
A                     55   2.1%
Other values (2)      22   0.8%

Space Separator
Value              Count   Frequency (%)
(space)             1416   100.0%

Other Punctuation
Value              Count   Frequency (%)
/                     55   100.0%

Most occurring scripts

Value     Count   Frequency (%)
Latin     10776   88.0%
Common     1471   12.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
e                     929   8.6%
o                     792   7.3%
r                     756   7.0%
n                     678   6.3%
t                     674   6.3%
i                     665   6.2%
N                     608   5.6%
Y                     608   5.6%
C                     606   5.6%
y                     573   5.3%
Other values (19)    3887   36.1%

Common
Value               Count   Frequency (%)
(space)              1416   96.3%
/                      55   3.7%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   12247   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
(space)              1416   11.6%
e                     929   7.6%
o                     792   6.5%
r                     756   6.2%
n                     678   5.5%
t                     674   5.5%
i                     665   5.4%
N                     608   5.0%
Y                     608   5.0%
C                     606   4.9%
Other values (21)    4515   36.9%
Distinct: 51
Distinct (%): 5.1%
Missing: 2
Missing (%): 0.2%
Memory size: 15.6 KiB

Length

Max length: 11
Median length: 10
Mean length: 6.9058116
Min length: 4

Characters and Unicode

Total characters: 6892
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 0.9%

Sample

1st row: Kings
2nd row: Manhattan
3rd row: Manhattan
4th row: Suffolk
5th row: Kings
Value               Count   Frequency (%)
manhattan             167   16.6%
kings                 127   12.6%
nassau                 85   8.5%
bronx                  81   8.1%
queens                 80   8.0%
suffolk                68   6.8%
monroe                 57   5.7%
westchester            44   4.4%
erie                   42   4.2%
onondaga               40   4.0%
Other values (42)     214   21.3%

Most occurring characters

ValueCountFrequency (%)
a 914
13.3%
n 908
13.2%
e 554
 
8.0%
s 535
 
7.8%
t 493
 
7.2%
o 394
 
5.7%
r 317
 
4.6%
u 286
 
4.1%
h 267
 
3.9%
M 229
 
3.3%
Other values (36) 1995
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5880
85.3%
Uppercase Letter 1005
 
14.6%
Space Separator 7
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 914
15.5%
n 908
15.4%
e 554
9.4%
s 535
9.1%
t 493
8.4%
o 394
 
6.7%
r 317
 
5.4%
u 286
 
4.9%
h 267
 
4.5%
i 224
 
3.8%
Other values (14) 988
16.8%
Uppercase Letter
ValueCountFrequency (%)
M 229
22.8%
K 127
12.6%
S 95
9.5%
B 90
 
9.0%
N 89
 
8.9%
Q 80
 
8.0%
O 77
 
7.7%
W 54
 
5.4%
E 42
 
4.2%
R 40
 
4.0%
Other values (11) 82
 
8.2%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6885
99.9%
Common 7
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 914
13.3%
n 908
13.2%
e 554
 
8.0%
s 535
 
7.8%
t 493
 
7.2%
o 394
 
5.7%
r 317
 
4.6%
u 286
 
4.2%
h 267
 
3.9%
M 229
 
3.3%
Other values (35) 1988
28.9%
Common
ValueCountFrequency (%)
7
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6892
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 914
13.3%
n 908
13.2%
e 554
 
8.0%
s 535
 
7.8%
t 493
 
7.2%
o 394
 
5.7%
r 317
 
4.6%
u 286
 
4.1%
h 267
 
3.9%
M 229
 
3.3%
Other values (36) 1995
28.9%

Operating Certificate Number
Real number (ℝ)

HIGH CORRELATION 

Distinct: 157
Distinct (%): 15.7%
Missing: 2
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 5090925.3
Minimum: 101000
Maximum: 7004010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 101000
5-th percentile: 1302000.9
Q1: 2952005
median: 5907001
Q3: 7002002
95-th percentile: 7003004
Maximum: 7004010
Range: 6903010
Interquartile range (IQR): 4049997

Descriptive statistics

Standard deviation: 2159187.1
Coefficient of variation (CV): 0.42412468
Kurtosis: -0.93403995
Mean: 5090925.3
Median Absolute Deviation (MAD): 1095053
Skewness: -0.66572673
Sum: 5.0807435 × 10^9
Variance: 4.6620889 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count   Frequency (%)
7002054                 44   4.4%
7000006                 33   3.3%
2701005                 27   2.7%
7002024                 25   2.5%
1401014                 24   2.4%
2951001                 24   2.4%
7002032                 22   2.2%
7001021                 21   2.1%
7002002                 21   2.1%
7003004                 19   1.9%
Other values (147)     738   73.8%

Minimum 10 values

Value      Count   Frequency (%)
101000         7   0.7%
101004         7   0.7%
228000         2   0.2%
301001         2   0.2%
303001         7   0.7%
401001         1   0.1%
427000         1   0.1%
501000         2   0.2%
601000         3   0.3%
602001         3   0.3%

Maximum 10 values

Value      Count   Frequency (%)
7004010        5   0.5%
7004003       13   1.3%
7003013        9   0.9%
7003010        8   0.8%
7003007        4   0.4%
7003006        3   0.3%
7003004       19   1.9%
7003003        8   0.8%
7003001        6   0.6%
7003000       13   1.3%

Facility ID
Real number (ℝ)

HIGH CORRELATION 

Distinct: 180
Distinct (%): 18.0%
Missing: 2
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1045.6012
Minimum: 1
Maximum: 3975
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 180.85
Q1: 541
median: 1117
Q3: 1450
95-th percentile: 1698.75
Maximum: 3975
Range: 3974
Interquartile range (IQR): 909

Descriptive statistics

Standard deviation: 629.88876
Coefficient of variation (CV): 0.60241779
Kurtosis: 3.7775221
Mean: 1045.6012
Median Absolute Deviation (MAD): 352
Skewness: 1.2419867
Sum: 1043510
Variance: 396759.85
Monotonicity: Not monotonic
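
The reported mean for Facility ID is consistent with averaging over non-missing values only: dividing the Sum by 998 (1000 observations minus 2 missing) reproduces it, whereas dividing by 1000 would not. A minimal sketch of that check, with illustrative variable names:

```python
# Values copied from the Facility ID summary and descriptive statistics.
n_observations = 1000
n_missing = 2
total = 1043510            # reported Sum
std_dev = 629.88876        # reported Standard deviation

# Missing values are assumed to be excluded from the denominator.
n_valid = n_observations - n_missing
mean = total / n_valid

# Coefficient of variation is the standard deviation relative to the mean.
cv = std_dev / mean

print(f"n_valid = {n_valid}")
print(f"mean    = {mean:.4f}")
print(f"CV      = {cv:.6f}")
```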
Histogram with fixed size bins (bins=50)

Common Values

Value                Count   Frequency (%)
413                     27   2.7%
1464                    25   2.5%
541                     22   2.2%
1456                    21   2.1%
1306                    21   2.1%
1305                    19   1.9%
245                     18   1.8%
630                     17   1.7%
511                     16   1.6%
1439                    15   1.5%
Other values (170)     797   79.7%

Minimum 10 values

Value   Count   Frequency (%)
1           7   0.7%
5           7   0.7%
39          2   0.2%
42          3   0.3%
43          2   0.2%
58          4   0.4%
66          1   0.1%
85          2   0.2%
98          3   0.3%
103         3   0.3%

Maximum 10 values

Value   Count   Frequency (%)
3975        6   0.6%
3376        5   0.5%
3297        1   0.1%
3067        9   0.9%
3058       11   1.1%
1740       12   1.2%
1738        5   0.5%
1737        1   0.1%
1692        6   0.6%
1639        4   0.4%

Distinct: 180
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

Length

Max length: 70
Median length: 53
Mean length: 32.195
Min length: 14

Characters and Unicode

Total characters: 32195
Distinct characters: 56
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 3.8%

Sample

1st row: University Hospital of Brooklyn
2nd row: NYU Hospital for Joint Diseases
3rd row: Hospital for Special Surgery
4th row: Good Samaritan Hospital Medical Center
5th row: Coney Island Hospital
ValueCountFrequency (%)
hospital 738
 
16.5%
center 429
 
9.6%
medical 274
 
6.1%
177
 
4.0%
st 124
 
2.8%
new 96
 
2.1%
york 96
 
2.1%
university 87
 
1.9%
of 75
 
1.7%
memorial 71
 
1.6%
Other values (280) 2302
51.5%

Most occurring characters

ValueCountFrequency (%)
3485
 
10.8%
e 3018
 
9.4%
t 2372
 
7.4%
i 2334
 
7.2%
a 2147
 
6.7%
o 2135
 
6.6%
s 1874
 
5.8%
l 1866
 
5.8%
n 1735
 
5.4%
r 1658
 
5.1%
Other values (46) 9571
29.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23959
74.4%
Uppercase Letter 4426
 
13.7%
Space Separator 3485
 
10.8%
Dash Punctuation 211
 
0.7%
Other Punctuation 114
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3018
12.6%
t 2372
9.9%
i 2334
9.7%
a 2147
9.0%
o 2135
8.9%
s 1874
7.8%
l 1866
7.8%
n 1735
7.2%
r 1658
6.9%
p 906
 
3.8%
Other values (15) 3914
16.3%
Uppercase Letter
ValueCountFrequency (%)
H 945
21.4%
C 709
16.0%
M 556
12.6%
S 391
8.8%
N 221
 
5.0%
L 161
 
3.6%
U 156
 
3.5%
P 131
 
3.0%
B 128
 
2.9%
D 128
 
2.9%
Other values (14) 900
20.3%
Other Punctuation
ValueCountFrequency (%)
& 38
33.3%
' 34
29.8%
/ 26
22.8%
. 14
 
12.3%
, 2
 
1.8%
Space Separator
ValueCountFrequency (%)
3485
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 211
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28385
88.2%
Common 3810
 
11.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3018
 
10.6%
t 2372
 
8.4%
i 2334
 
8.2%
a 2147
 
7.6%
o 2135
 
7.5%
s 1874
 
6.6%
l 1866
 
6.6%
n 1735
 
6.1%
r 1658
 
5.8%
H 945
 
3.3%
Other values (39) 8301
29.2%
Common
ValueCountFrequency (%)
3485
91.5%
- 211
 
5.5%
& 38
 
1.0%
' 34
 
0.9%
/ 26
 
0.7%
. 14
 
0.4%
, 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32195
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3485
 
10.8%
e 3018
 
9.4%
t 2372
 
7.4%
i 2334
 
7.2%
a 2147
 
6.7%
o 2135
 
6.6%
s 1874
 
5.8%
l 1866
 
5.8%
n 1735
 
5.4%
r 1658
 
5.1%
Other values (46) 9571
29.7%

Age Group
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

50 to 69: 296
70 or Older: 268
30 to 49: 201
0 to 17: 124
18 to 29: 111

Length

Max length: 11
Median length: 8
Mean length: 8.68
Min length: 7

Characters and Unicode

Total characters: 8680
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 50 to 69
2nd row: 30 to 49
3rd row: 50 to 69
4th row: 18 to 29
5th row: 50 to 69

Common Values

Value         Count   Frequency (%)
50 to 69        296   29.6%
70 or Older     268   26.8%
30 to 49        201   20.1%
0 to 17         124   12.4%
18 to 29        111   11.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value              Count   Frequency (%)
to                   732   24.4%
50                   296   9.9%
69                   296   9.9%
70                   268   8.9%
or                   268   8.9%
older                268   8.9%
30                   201   6.7%
49                   201   6.7%
0                    124   4.1%
17                   124   4.1%
Other values (2)     222   7.4%

Most occurring characters

ValueCountFrequency (%)
2000
23.0%
o 1000
11.5%
0 889
10.2%
t 732
 
8.4%
9 608
 
7.0%
r 536
 
6.2%
7 392
 
4.5%
5 296
 
3.4%
6 296
 
3.4%
d 268
 
3.1%
Other values (8) 1663
19.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3340
38.5%
Lowercase Letter 3072
35.4%
Space Separator 2000
23.0%
Uppercase Letter 268
 
3.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 889
26.6%
9 608
18.2%
7 392
11.7%
5 296
 
8.9%
6 296
 
8.9%
1 235
 
7.0%
3 201
 
6.0%
4 201
 
6.0%
8 111
 
3.3%
2 111
 
3.3%
Lowercase Letter
ValueCountFrequency (%)
o 1000
32.6%
t 732
23.8%
r 536
17.4%
d 268
 
8.7%
e 268
 
8.7%
l 268
 
8.7%
Space Separator
ValueCountFrequency (%)
2000
100.0%
Uppercase Letter
ValueCountFrequency (%)
O 268
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5340
61.5%
Latin 3340
38.5%

Most frequent character per script

Common
ValueCountFrequency (%)
2000
37.5%
0 889
16.6%
9 608
 
11.4%
7 392
 
7.3%
5 296
 
5.5%
6 296
 
5.5%
1 235
 
4.4%
3 201
 
3.8%
4 201
 
3.8%
8 111
 
2.1%
Latin
ValueCountFrequency (%)
o 1000
29.9%
t 732
21.9%
r 536
16.0%
d 268
 
8.0%
e 268
 
8.0%
O 268
 
8.0%
l 268
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8680
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2000
23.0%
o 1000
11.5%
0 889
10.2%
t 732
 
8.4%
9 608
 
7.0%
r 536
 
6.2%
7 392
 
4.5%
5 296
 
3.4%
6 296
 
3.4%
d 268
 
3.1%
Other values (8) 1663
19.2%

Gender
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

F: 569
M: 431

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: M

Common Values

Value   Count   Frequency (%)
F         569   56.9%
M         431   43.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
f         569   56.9%
m         431   43.1%

Most occurring characters

ValueCountFrequency (%)
F 569
56.9%
M 431
43.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 569
56.9%
M 431
43.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 569
56.9%
M 431
43.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 569
56.9%
M 431
43.1%

Race
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

White: 602
Black/African American: 202
Other Race: 184
Unknown: 12

Length

Max length: 22
Median length: 5
Mean length: 9.378
Min length: 5

Characters and Unicode

Total characters: 9378
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Black/African American
2nd row: White
3rd row: White
4th row: White
5th row: Black/African American

Common Values

Value                    Count   Frequency (%)
White                      602   60.2%
Black/African American     202   20.2%
Other Race                 184   18.4%
Unknown                     12   1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value           Count   Frequency (%)
white             602   43.4%
black/african     202   14.6%
american          202   14.6%
other             184   13.3%
race              184   13.3%
unknown            12   0.9%

Most occurring characters

ValueCountFrequency (%)
e 1172
12.5%
i 1006
10.7%
a 790
 
8.4%
c 790
 
8.4%
h 786
 
8.4%
t 786
 
8.4%
W 602
 
6.4%
r 588
 
6.3%
n 440
 
4.7%
A 404
 
4.3%
Other values (12) 2014
21.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7202
76.8%
Uppercase Letter 1588
 
16.9%
Space Separator 386
 
4.1%
Other Punctuation 202
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1172
16.3%
i 1006
14.0%
a 790
11.0%
c 790
11.0%
h 786
10.9%
t 786
10.9%
r 588
8.2%
n 440
 
6.1%
k 214
 
3.0%
f 202
 
2.8%
Other values (4) 428
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
W 602
37.9%
A 404
25.4%
B 202
 
12.7%
O 184
 
11.6%
R 184
 
11.6%
U 12
 
0.8%
Space Separator
ValueCountFrequency (%)
386
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 202
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8790
93.7%
Common 588
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1172
13.3%
i 1006
11.4%
a 790
9.0%
c 790
9.0%
h 786
8.9%
t 786
8.9%
W 602
6.8%
r 588
6.7%
n 440
 
5.0%
A 404
 
4.6%
Other values (10) 1426
16.2%
Common
ValueCountFrequency (%)
386
65.6%
/ 202
34.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9378
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1172
12.5%
i 1006
10.7%
a 790
 
8.4%
c 790
 
8.4%
h 786
 
8.4%
t 786
 
8.4%
W 602
 
6.4%
r 588
 
6.3%
n 440
 
4.7%
A 404
 
4.3%
Other values (12) 2014
21.5%

Ethnicity
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

Not Span/Hispanic: 815
Spanish/Hispanic: 116
Unknown: 69

Length

Max length: 17
Median length: 17
Mean length: 16.194
Min length: 7

Characters and Unicode

Total characters: 16194
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not Span/Hispanic
2nd row: Not Span/Hispanic
3rd row: Not Span/Hispanic
4th row: Spanish/Hispanic
5th row: Not Span/Hispanic

Common Values

Value               Count   Frequency (%)
Not Span/Hispanic     815   81.5%
Spanish/Hispanic      116   11.6%
Unknown                69   6.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value              Count   Frequency (%)
not                  815   44.9%
span/hispanic        815   44.9%
spanish/hispanic     116   6.4%
unknown               69   3.8%

Most occurring characters

ValueCountFrequency (%)
n 2069
12.8%
i 1978
12.2%
p 1862
11.5%
a 1862
11.5%
s 1047
 
6.5%
/ 931
 
5.7%
c 931
 
5.7%
S 931
 
5.7%
H 931
 
5.7%
o 884
 
5.5%
Other values (7) 2768
17.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11702
72.3%
Uppercase Letter 2746
 
17.0%
Other Punctuation 931
 
5.7%
Space Separator 815
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 2069
17.7%
i 1978
16.9%
p 1862
15.9%
a 1862
15.9%
s 1047
8.9%
c 931
8.0%
o 884
7.6%
t 815
 
7.0%
h 116
 
1.0%
k 69
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
S 931
33.9%
H 931
33.9%
N 815
29.7%
U 69
 
2.5%
Other Punctuation
ValueCountFrequency (%)
/ 931
100.0%
Space Separator
ValueCountFrequency (%)
815
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14448
89.2%
Common 1746
 
10.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 2069
14.3%
i 1978
13.7%
p 1862
12.9%
a 1862
12.9%
s 1047
7.2%
c 931
6.4%
S 931
6.4%
H 931
6.4%
o 884
6.1%
N 815
 
5.6%
Other values (5) 1138
7.9%
Common
ValueCountFrequency (%)
/ 931
53.3%
815
46.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16194
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 2069
12.8%
i 1978
12.2%
p 1862
11.5%
a 1862
11.5%
s 1047
 
6.5%
/ 931
 
5.7%
c 931
 
5.7%
S 931
 
5.7%
H 931
 
5.7%
o 884
 
5.5%
Other values (7) 2768
17.1%

Length of Stay
Real number (ℝ)

HIGH CORRELATION 

Distinct: 39
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.535
Minimum: 1
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 6
95-th percentile: 17
Maximum: 120
Range: 119
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.6455492
Coefficient of variation (CV): 1.5619782
Kurtosis: 76.905785
Mean: 5.535
Median Absolute Deviation (MAD): 2
Skewness: 7.229994
Sum: 5535
Variance: 74.745521
Monotonicity: Not monotonic
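
Several of the Length of Stay figures are simple functions of the others, which makes them easy to sanity-check. A minimal sketch using the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); variable names are illustrative:

```python
# Values copied from the Length of Stay quantile and descriptive statistics.
minimum, q1, q3, maximum = 1, 2, 6, 120
mean = 5.535
std_dev = 8.6455492

value_range = maximum - minimum   # compare with the reported Range
iqr = q3 - q1                     # compare with the reported IQR
cv = std_dev / mean               # compare with the reported CV
variance = std_dev ** 2           # compare with the reported Variance

print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.4f}")
```

A CV well above 1, together with a median of 3 against a maximum of 120, reflects the long right tail also indicated by the reported skewness and kurtosis.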
Histogram with fixed size bins (bins=39)

Common Values

Value               Count   Frequency (%)
2                     216   21.6%
1                     178   17.8%
3                     158   15.8%
4                      99   9.9%
5                      66   6.6%
6                      64   6.4%
7                      38   3.8%
8                      31   3.1%
9                      18   1.8%
11                     17   1.7%
Other values (29)     115   11.5%

Minimum 10 values

Value   Count   Frequency (%)
1         178   17.8%
2         216   21.6%
3         158   15.8%
4          99   9.9%
5          66   6.6%
6          64   6.4%
7          38   3.8%
8          31   3.1%
9          18   1.8%
10         14   1.4%

Maximum 10 values

Value   Count   Frequency (%)
120         2   0.2%
90          1   0.1%
75          1   0.1%
61          1   0.1%
48          1   0.1%
47          1   0.1%
42          1   0.1%
38          1   0.1%
36          2   0.2%
31          1   0.1%

Type of Admission
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

Emergency: 631
Elective: 176
Urgent: 105
Newborn: 87
Not Available: 1

Length

Max length: 13
Median length: 9
Mean length: 8.339
Min length: 6

Characters and Unicode

Total characters: 8339
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: Emergency
2nd row: Elective
3rd row: Elective
4th row: Emergency
5th row: Emergency

Common Values

Value           Count   Frequency (%)
Emergency         631   63.1%
Elective          176   17.6%
Urgent            105   10.5%
Newborn            87   8.7%
Not Available       1   0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count   Frequency (%)
emergency     631   63.0%
elective      176   17.6%
urgent        105   10.5%
newborn        87   8.7%
not             1   0.1%
available       1   0.1%

Most occurring characters

ValueCountFrequency (%)
e 1807
21.7%
r 823
9.9%
n 823
9.9%
E 807
9.7%
c 807
9.7%
g 736
8.8%
y 631
 
7.6%
m 631
 
7.6%
t 282
 
3.4%
l 178
 
2.1%
Other values (10) 814
9.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7337
88.0%
Uppercase Letter 1001
 
12.0%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1807
24.6%
r 823
11.2%
n 823
11.2%
c 807
11.0%
g 736
10.0%
y 631
 
8.6%
m 631
 
8.6%
t 282
 
3.8%
l 178
 
2.4%
v 177
 
2.4%
Other values (5) 442
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
E 807
80.6%
U 105
 
10.5%
N 88
 
8.8%
A 1
 
0.1%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8338
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1807
21.7%
r 823
9.9%
n 823
9.9%
E 807
9.7%
c 807
9.7%
g 736
8.8%
y 631
 
7.6%
m 631
 
7.6%
t 282
 
3.4%
l 178
 
2.1%
Other values (9) 813
9.8%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8339
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1807
21.7%
r 823
9.9%
n 823
9.9%
E 807
9.7%
c 807
9.7%
g 736
8.8%
y 631
 
7.6%
m 631
 
7.6%
t 282
 
3.4%
l 178
 
2.1%
Other values (10) 814
9.8%

Patient Disposition
Categorical

IMBALANCE 

Distinct: 16
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

Home or Self Care: 661
Home w/ Home Health Services: 141
Skilled Nursing Home: 86
Left Against Medical Advice: 28
Short-term Hospital: 26
Other values (11): 58

Length

Max length: 37
Median length: 17
Mean length: 19.401
Min length: 7

Characters and Unicode

Total characters: 19401
Distinct characters: 42
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.5%

Sample

1st row: Home or Self Care
2nd row: Home w/ Home Health Services
3rd row: Home or Self Care
4th row: Home or Self Care
5th row: Home or Self Care

Common Values

Value                                   Count   Frequency (%)
Home or Self Care                         661   66.1%
Home w/ Home Health Services              141   14.1%
Skilled Nursing Home                       86   8.6%
Left Against Medical Advice                28   2.8%
Short-term Hospital                        26   2.6%
Expired                                    24   2.4%
Inpatient Rehabilitation Facility          16   1.6%
Psychiatric Hospital or Unit of Hosp        5   0.5%
Hospice - Home                              3   0.3%
Facility w/ Custodial/Supportive Care       3   0.3%
Other values (6)                            7   0.7%
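
The 56.0% imbalance figure in the alert for this variable is consistent with a score defined as one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for 16 classes. The sketch below works under that assumption and should not be read as a statement of how the tool computes it. The "Other values (6)" row is not broken down in the report; the split into five singletons and one value with two records is inferred from Unique = 5 and is also an assumption. `imbalance_score` is a hypothetical helper name.

```python
import math

# Counts from the Common Values table, followed by the inferred split of
# "Other values (6) = 7" into one value seen twice and five seen once.
counts = [661, 141, 86, 28, 26, 24, 16, 5, 3, 3, 2, 1, 1, 1, 1, 1]


def imbalance_score(class_counts):
    """Return 1 - H / H_max, where H is the Shannon entropy in bits."""
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0
    total = sum(class_counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"Imbalance: {100 * score:.1f}%")
```

A score of 0 would mean all 16 dispositions are equally common and a score near 1 would mean almost every record falls in one class, so a value in the mid-50s reflects the dominance of "Home or Self Care".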

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
home                 1032   26.3%
or                    667   17.0%
care                  666   17.0%
self                  661   16.8%
w                     144   3.7%
health                141   3.6%
services              141   3.6%
skilled                86   2.2%
nursing                86   2.2%
hospital               34   0.9%
Other values (33)     269   6.9%

Most occurring characters

ValueCountFrequency (%)
e 3063
15.8%
2927
15.1%
o 1803
9.3%
r 1658
8.5%
H 1217
 
6.3%
l 1077
 
5.6%
m 1061
 
5.5%
a 980
 
5.1%
S 918
 
4.7%
f 695
 
3.6%
Other values (32) 4002
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13184
68.0%
Uppercase Letter 3111
 
16.0%
Space Separator 2927
 
15.1%
Other Punctuation 149
 
0.8%
Dash Punctuation 30
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3063
23.2%
o 1803
13.7%
r 1658
12.6%
l 1077
 
8.2%
m 1061
 
8.0%
a 980
 
7.4%
f 695
 
5.3%
i 591
 
4.5%
t 391
 
3.0%
s 310
 
2.4%
Other values (13) 1555
11.8%
Uppercase Letter
ValueCountFrequency (%)
H 1217
39.1%
S 918
29.5%
C 675
21.7%
N 87
 
2.8%
A 58
 
1.9%
L 32
 
1.0%
M 32
 
1.0%
E 25
 
0.8%
F 20
 
0.6%
R 16
 
0.5%
Other values (5) 31
 
1.0%
Other Punctuation
ValueCountFrequency (%)
/ 148
99.3%
' 1
 
0.7%
Space Separator
ValueCountFrequency (%)
2927
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16295
84.0%
Common 3106
 
16.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3063
18.8%
o 1803
11.1%
r 1658
10.2%
H 1217
 
7.5%
l 1077
 
6.6%
m 1061
 
6.5%
a 980
 
6.0%
S 918
 
5.6%
f 695
 
4.3%
C 675
 
4.1%
Other values (28) 3148
19.3%
Common
ValueCountFrequency (%)
2927
94.2%
/ 148
 
4.8%
- 30
 
1.0%
' 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19401
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3063
15.8%
2927
15.1%
o 1803
9.3%
r 1658
8.5%
H 1217
 
6.3%
l 1077
 
5.6%
m 1061
 
5.5%
a 980
 
5.1%
S 918
 
4.7%
f 695
 
3.6%
Other values (32) 4002
20.6%

APR DRG Code
Real number (ℝ)

HIGH CORRELATION 

Distinct: 201
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 409.292
Minimum: 1
Maximum: 952
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 58
Q1: 196.75
median: 380
Q3: 640
95-th percentile: 775
Maximum: 952
Range: 951
Interquartile range (IQR): 443.25

Descriptive statistics

Standard deviation: 241.81182
Coefficient of variation (CV): 0.59080514
Kurtosis: -1.1533972
Mean: 409.292
Median Absolute Deviation (MAD): 187.5
Skewness: 0.29453521
Sum: 409292
Variance: 58472.956
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count   Frequency (%)
640                     80   8.0%
560                     58   5.8%
140                     29   2.9%
383                     24   2.4%
720                     24   2.4%
194                     23   2.3%
540                     23   2.3%
203                     22   2.2%
463                     20   2.0%
175                     19   1.9%
Other values (191)     678   67.8%

Minimum 10 values

Value   Count   Frequency (%)
1           1   0.1%
2           1   0.1%
3           1   0.1%
4           1   0.1%
5           2   0.2%
20          1   0.1%
21          3   0.3%
24          2   0.2%
26          1   0.1%
41          2   0.2%

Maximum 10 values

Value   Count   Frequency (%)
952         2   0.2%
951         4   0.4%
950         2   0.2%
930         1   0.1%
894         2   0.2%
893         1   0.1%
892         1   0.1%
890         1   0.1%
861         8   0.8%
860        14   1.4%

Distinct: 201
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

Length

Max length: 89
Median length: 57
Mean length: 36.071
Min length: 5

Characters and Unicode

Total characters: 36071
Distinct characters: 42
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 68
Unique (%): 6.8%

Sample

1st row: SEIZURE
2nd row: HIP JOINT REPLACEMENT
3rd row: KNEE & LOWER LEG PROCEDURES EXCEPT FOOT
4th row: INFECTIONS OF UPPER RESPIRATORY TRACT
5th row: PEPTIC ULCER & GASTRITIS
ValueCountFrequency (%)
428
 
8.8%
other 249
 
5.1%
neonate 172
 
3.5%
w 126
 
2.6%
or 126
 
2.6%
procedures 124
 
2.5%
disorders 96
 
2.0%
infections 93
 
1.9%
newborn 83
 
1.7%
problem 83
 
1.7%
Other values (390) 3290
67.6%

Most occurring characters

ValueCountFrequency (%)
3870
 
10.7%
E 3773
 
10.5%
R 2925
 
8.1%
O 2611
 
7.2%
A 2483
 
6.9%
I 2415
 
6.7%
N 2255
 
6.3%
T 2171
 
6.0%
S 1853
 
5.1%
C 1619
 
4.5%
Other values (32) 10096
28.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 30963
85.8%
Space Separator 3870
 
10.7%
Other Punctuation 713
 
2.0%
Decimal Number 387
 
1.1%
Math Symbol 89
 
0.2%
Dash Punctuation 49
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3773
12.2%
R 2925
 
9.4%
O 2611
 
8.4%
A 2483
 
8.0%
I 2415
 
7.8%
N 2255
 
7.3%
T 2171
 
7.0%
S 1853
 
6.0%
C 1619
 
5.2%
L 1279
 
4.1%
Other values (16) 7579
24.5%
Decimal Number
ValueCountFrequency (%)
9 182
47.0%
2 90
23.3%
4 86
22.2%
0 16
 
4.1%
1 6
 
1.6%
6 4
 
1.0%
5 3
 
0.8%
Other Punctuation
ValueCountFrequency (%)
& 431
60.4%
, 206
28.9%
/ 64
 
9.0%
. 12
 
1.7%
Math Symbol
ValueCountFrequency (%)
> 83
93.3%
+ 4
 
4.5%
< 2
 
2.2%
Space Separator
ValueCountFrequency (%)
3870
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 49
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30963
85.8%
Common 5108
 
14.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3773
12.2%
R 2925
 
9.4%
O 2611
 
8.4%
A 2483
 
8.0%
I 2415
 
7.8%
N 2255
 
7.3%
T 2171
 
7.0%
S 1853
 
6.0%
C 1619
 
5.2%
L 1279
 
4.1%
Other values (16) 7579
24.5%
Common
ValueCountFrequency (%)
3870
75.8%
& 431
 
8.4%
, 206
 
4.0%
9 182
 
3.6%
2 90
 
1.8%
4 86
 
1.7%
> 83
 
1.6%
/ 64
 
1.3%
- 49
 
1.0%
0 16
 
0.3%
Other values (6) 31
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36071
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3870
 
10.7%
E 3773
 
10.5%
R 2925
 
8.1%
O 2611
 
7.2%
A 2483
 
6.9%
I 2415
 
6.7%
N 2255
 
6.3%
T 2171
 
6.0%
S 1853
 
5.1%
C 1619
 
4.5%
Other values (32) 10096
28.0%
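
The category labels in the tables above correspond to Unicode general categories. As a minimal sketch (not the report generator's own code), Python's standard unicodedata module can tally them. The helper name category_counts is hypothetical, and the three strings are taken from the Sample rows of this variable.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three of the sample rows shown for this variable.
sample = ["SEIZURE", "HIP JOINT REPLACEMENT", "PEPTIC ULCER & GASTRITIS"]
counts = category_counts(sample)
print(dict(counts))
```

The two-letter codes map to the labels used in this report: Lu is Uppercase Letter, Zs is Space Separator, Po is Other Punctuation, Nd is Decimal Number, Sm is Math Symbol, and Pd is Dash Punctuation.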

APR MDC Code
Real number (ℝ)

HIGH CORRELATION 

Distinct: 24
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.207
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 9
Q3: 15
95-th percentile: 20
Maximum: 25
Range: 24
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.8900631
Coefficient of variation (CV): 0.57706115
Kurtosis: -0.91269768
Mean: 10.207
Median Absolute Deviation (MAD): 5
Skewness: 0.42526501
Sum: 10207
Variance: 34.692844
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Value | Count | Frequency (%)
5 | 152 | 15.2%
4 | 106 | 10.6%
14 | 105 | 10.5%
15 | 89 | 8.9%
8 | 81 | 8.1%
6 | 75 | 7.5%
11 | 51 | 5.1%
1 | 47 | 4.7%
9 | 39 | 3.9%
20 | 39 | 3.9%
Other values (14) | 216 | 21.6%
Value | Count | Frequency (%)
1 | 47 | 4.7%
3 | 11 | 1.1%
4 | 106 | 10.6%
5 | 152 | 15.2%
6 | 75 | 7.5%
7 | 21 | 2.1%
8 | 81 | 8.1%
9 | 39 | 3.9%
10 | 24 | 2.4%
11 | 51 | 5.1%
Value | Count | Frequency (%)
25 | 1 | 0.1%
24 | 5 | 0.5%
23 | 23 | 2.3%
22 | 3 | 0.3%
21 | 8 | 0.8%
20 | 39 | 3.9%
19 | 37 | 3.7%
18 | 38 | 3.8%
17 | 11 | 1.1%
16 | 15 | 1.5%
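
The derived figures in the APR MDC Code summary can be re-checked from the reported mean, standard deviation, and quartiles. The snippet below is an illustrative sketch with the values copied by hand from this section; the variable names are chosen here and are not part of the report.

```python
import math

# Values copied from the APR MDC Code summary in this report.
n = 1000                      # number of observations
mean, std = 10.207, 5.8900631
minimum, maximum = 1, 25
q1, q3 = 5, 15

cv = std / mean               # coefficient of variation
value_range = maximum - minimum
iqr = q3 - q1                 # interquartile range
variance = std ** 2
total = mean * n              # should match the reported Sum

print(f"CV={cv:.8f} range={value_range} IQR={iqr} "
      f"variance={variance:.6f} sum={total:.0f}")
```

Each result agrees with the corresponding line of the summary (CV 0.57706115, Range 24, IQR 10, Variance 34.692844, Sum 10207) up to rounding.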

APR MDC Description
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
Diseases and Disorders of the Circulatory System
152 
Diseases and Disorders of the Respiratory System
106 
Pregnancy, Childbirth and the Puerperium
105 
Newborns and Other Neonates with Conditions Originating in the Perinatal Period
89 
Diseases and Disorders of the Musculoskeletal System and Conn Tissue
81 
Other values (19)
467 

Length

Max length: 100
Median length: 78
Mean length: 55.74
Min length: 5

Characters and Unicode

Total characters: 55740
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: Diseases and Disorders of the Nervous System
2nd row: Diseases and Disorders of the Musculoskeletal System and Conn Tissue
3rd row: Diseases and Disorders of the Musculoskeletal System and Conn Tissue
4th row: Ear, Nose, Mouth, Throat and Craniofacial Diseases and Disorders
5th row: Diseases and Disorders of the Digestive System

Common Values

Value | Count | Frequency (%)
Diseases and Disorders of the Circulatory System | 152 | 15.2%
Diseases and Disorders of the Respiratory System | 106 | 10.6%
Pregnancy, Childbirth and the Puerperium | 105 | 10.5%
Newborns and Other Neonates with Conditions Originating in the Perinatal Period | 89 | 8.9%
Diseases and Disorders of the Musculoskeletal System and Conn Tissue | 81 | 8.1%
Diseases and Disorders of the Digestive System | 75 | 7.5%
Diseases and Disorders of the Kidney and Urinary Tract | 51 | 5.1%
Diseases and Disorders of the Nervous System | 47 | 4.7%
Diseases and Disorders of the Skin, Subcutaneous Tissue and Breast | 39 | 3.9%
Alcohol/Drug Use and Alcohol/Drug Induced Organic Mental Disorders | 39 | 3.9%
Other values (14) | 216 | 21.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 1233
15.8%
the 785
 
10.0%
disorders 732
 
9.4%
diseases 716
 
9.2%
of 614
 
7.9%
system 501
 
6.4%
other 162
 
2.1%
circulatory 152
 
1.9%
tissue 120
 
1.5%
respiratory 106
 
1.4%
Other values (76) 2698
34.5%

Most occurring characters

ValueCountFrequency (%)
6819
12.2%
e 5897
10.6%
s 5497
 
9.9%
i 4000
 
7.2%
r 3661
 
6.6%
a 3641
 
6.5%
t 3227
 
5.8%
n 3083
 
5.5%
o 2953
 
5.3%
d 2504
 
4.5%
Other values (36) 14458
25.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 43445
77.9%
Space Separator 6819
 
12.2%
Uppercase Letter 5049
 
9.1%
Other Punctuation 427
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5897
13.6%
s 5497
12.7%
i 4000
9.2%
r 3661
8.4%
a 3641
8.4%
t 3227
7.4%
n 3083
7.1%
o 2953
 
6.8%
d 2504
 
5.8%
h 1459
 
3.4%
Other values (14) 7523
17.3%
Uppercase Letter
ValueCountFrequency (%)
D 1601
31.7%
S 702
13.9%
C 480
 
9.5%
P 455
 
9.0%
O 305
 
6.0%
N 260
 
5.1%
M 212
 
4.2%
T 199
 
3.9%
R 159
 
3.1%
I 133
 
2.6%
Other values (9) 543
 
10.8%
Other Punctuation
ValueCountFrequency (%)
, 349
81.7%
/ 78
 
18.3%
Space Separator
ValueCountFrequency (%)
6819
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48494
87.0%
Common 7246
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5897
12.2%
s 5497
11.3%
i 4000
 
8.2%
r 3661
 
7.5%
a 3641
 
7.5%
t 3227
 
6.7%
n 3083
 
6.4%
o 2953
 
6.1%
d 2504
 
5.2%
D 1601
 
3.3%
Other values (33) 12430
25.6%
Common
ValueCountFrequency (%)
6819
94.1%
, 349
 
4.8%
/ 78
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55740
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6819
12.2%
e 5897
10.6%
s 5497
 
9.9%
i 4000
 
7.2%
r 3661
 
6.6%
a 3641
 
6.5%
t 3227
 
5.8%
n 3083
 
5.5%
o 2953
 
5.3%
d 2504
 
4.5%
Other values (36) 14458
25.9%

APR Severity of Illness Code
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
2
386 
1
352 
3
203 
4
59 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 3

Common Values

Value | Count | Frequency (%)
2 | 386 | 38.6%
1 | 352 | 35.2%
3 | 203 | 20.3%
4 | 59 | 5.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2 386
38.6%
1 352
35.2%
3 203
20.3%
4 59
 
5.9%

Most occurring characters

ValueCountFrequency (%)
2 386
38.6%
1 352
35.2%
3 203
20.3%
4 59
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 386
38.6%
1 352
35.2%
3 203
20.3%
4 59
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 386
38.6%
1 352
35.2%
3 203
20.3%
4 59
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 386
38.6%
1 352
35.2%
3 203
20.3%
4 59
 
5.9%

APR Severity of Illness Description
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
Moderate
386 
Minor
352 
Major
203 
Extreme
59 

Length

Max length: 8
Median length: 5
Mean length: 6.276
Min length: 5

Characters and Unicode

Total characters: 6276
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Moderate
2nd row: Moderate
3rd row: Moderate
4th row: Moderate
5th row: Major

Common Values

Value | Count | Frequency (%)
Moderate | 386 | 38.6%
Minor | 352 | 35.2%
Major | 203 | 20.3%
Extreme | 59 | 5.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
moderate 386
38.6%
minor 352
35.2%
major 203
20.3%
extreme 59
 
5.9%

Most occurring characters

ValueCountFrequency (%)
r 1000
15.9%
M 941
15.0%
o 941
15.0%
e 890
14.2%
a 589
9.4%
t 445
7.1%
d 386
 
6.2%
i 352
 
5.6%
n 352
 
5.6%
j 203
 
3.2%
Other values (3) 177
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5276
84.1%
Uppercase Letter 1000
 
15.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1000
19.0%
o 941
17.8%
e 890
16.9%
a 589
11.2%
t 445
8.4%
d 386
 
7.3%
i 352
 
6.7%
n 352
 
6.7%
j 203
 
3.8%
x 59
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
M 941
94.1%
E 59
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 6276
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1000
15.9%
M 941
15.0%
o 941
15.0%
e 890
14.2%
a 589
9.4%
t 445
7.1%
d 386
 
6.2%
i 352
 
5.6%
n 352
 
5.6%
j 203
 
3.2%
Other values (3) 177
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1000
15.9%
M 941
15.0%
o 941
15.0%
e 890
14.2%
a 589
9.4%
t 445
7.1%
d 386
 
6.2%
i 352
 
5.6%
n 352
 
5.6%
j 203
 
3.2%
Other values (3) 177
 
2.8%
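
The length and character statistics for APR Severity of Illness Description follow directly from the four category counts. The sketch below re-derives them from the Common Values table of this section; it is an illustration with hand-copied counts, not the report's own computation.

```python
from statistics import median

# Category counts copied from the Common Values table of this section.
counts = {"Moderate": 386, "Minor": 352, "Major": 203, "Extreme": 59}

# One string length per observation.
lengths = [len(label) for label, c in counts.items() for _ in range(c)]

total_chars = sum(lengths)
mean_length = total_chars / len(lengths)
median_length = median(lengths)
max_length, min_length = max(lengths), min(lengths)

# Occurrences of lowercase "e" across all observations.
lowercase_e = sum(label.count("e") * c for label, c in counts.items())

print(total_chars, mean_length, median_length, max_length, min_length, lowercase_e)
```

The results match the reported Total characters (6276), Mean length (6.276), Median length (5), Max length (8), Min length (5), and the count of 890 for the character e.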

APR Risk of Mortality
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
Minor
623 
Moderate
220 
Major
104 
Extreme
 
53

Length

Max length: 8
Median length: 5
Mean length: 5.766
Min length: 5

Characters and Unicode

Total characters: 5766
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Moderate
2nd row: Minor
3rd row: Minor
4th row: Minor
5th row: Moderate

Common Values

Value | Count | Frequency (%)
Minor | 623 | 62.3%
Moderate | 220 | 22.0%
Major | 104 | 10.4%
Extreme | 53 | 5.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
minor 623
62.3%
moderate 220
 
22.0%
major 104
 
10.4%
extreme 53
 
5.3%

Most occurring characters

ValueCountFrequency (%)
r 1000
17.3%
M 947
16.4%
o 947
16.4%
i 623
10.8%
n 623
10.8%
e 546
9.5%
a 324
 
5.6%
t 273
 
4.7%
d 220
 
3.8%
j 104
 
1.8%
Other values (3) 159
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4766
82.7%
Uppercase Letter 1000
 
17.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1000
21.0%
o 947
19.9%
i 623
13.1%
n 623
13.1%
e 546
11.5%
a 324
 
6.8%
t 273
 
5.7%
d 220
 
4.6%
j 104
 
2.2%
x 53
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
M 947
94.7%
E 53
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 5766
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1000
17.3%
M 947
16.4%
o 947
16.4%
i 623
10.8%
n 623
10.8%
e 546
9.5%
a 324
 
5.6%
t 273
 
4.7%
d 220
 
3.8%
j 104
 
1.8%
Other values (3) 159
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1000
17.3%
M 947
16.4%
o 947
16.4%
i 623
10.8%
n 623
10.8%
e 546
9.5%
a 324
 
5.6%
t 273
 
4.7%
d 220
 
3.8%
j 104
 
1.8%
Other values (3) 159
 
2.8%

APR Medical Surgical Description
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
Medical
769 
Surgical
231 

Length

Max length: 8
Median length: 7
Mean length: 7.231
Min length: 7

Characters and Unicode

Total characters: 7231
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Medical
2nd row: Surgical
3rd row: Surgical
4th row: Medical
5th row: Medical

Common Values

Value | Count | Frequency (%)
Medical | 769 | 76.9%
Surgical | 231 | 23.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
medical 769
76.9%
surgical 231
 
23.1%

Most occurring characters

ValueCountFrequency (%)
i 1000
13.8%
c 1000
13.8%
a 1000
13.8%
l 1000
13.8%
M 769
10.6%
e 769
10.6%
d 769
10.6%
S 231
 
3.2%
u 231
 
3.2%
r 231
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6231
86.2%
Uppercase Letter 1000
 
13.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1000
16.0%
c 1000
16.0%
a 1000
16.0%
l 1000
16.0%
e 769
12.3%
d 769
12.3%
u 231
 
3.7%
r 231
 
3.7%
g 231
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
M 769
76.9%
S 231
 
23.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 7231
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1000
13.8%
c 1000
13.8%
a 1000
13.8%
l 1000
13.8%
M 769
10.6%
e 769
10.6%
d 769
10.6%
S 231
 
3.2%
u 231
 
3.2%
r 231
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7231
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1000
13.8%
c 1000
13.8%
a 1000
13.8%
l 1000
13.8%
M 769
10.6%
e 769
10.6%
d 769
10.6%
S 231
 
3.2%
u 231
 
3.2%
r 231
 
3.2%

Source of Payment 1
Categorical

Distinct: 8
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
Medicare
345 
Insurance Company
312 
Medicaid
149 
Blue Cross
135 
Self-Pay
45 
Other values (3)
 
14

Length

Max length: 25
Median length: 8
Mean length: 11.262
Min length: 8

Characters and Unicode

Total characters: 11262
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: Insurance Company
2nd row: Insurance Company
3rd row: Other Non-Federal Program
4th row: Medicare
5th row: Medicaid

Common Values

Value | Count | Frequency (%)
Medicare | 345 | 34.5%
Insurance Company | 312 | 31.2%
Medicaid | 149 | 14.9%
Blue Cross | 135 | 13.5%
Self-Pay | 45 | 4.5%
Workers Compensation | 10 | 1.0%
Other Non-Federal Program | 3 | 0.3%
Other Federal Program | 1 | 0.1%
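
The overview at the top of this section shows only the five most frequent payers and collapses the rest into "Other values (3)" with a count of 14. A minimal sketch of that kind of truncated value-count table follows; the function name value_count_table and its top parameter are hypothetical, and the input is rebuilt from the counts in the table above rather than from the original data.

```python
from collections import Counter


def value_count_table(values, top=10):
    """Return (value, count, percent) rows for the `top` most common values,
    plus one aggregated "Other values (k)" row if anything is left over."""
    counts = Counter(values)
    n = len(values)
    ranked = counts.most_common()
    rows = [(v, c, 100.0 * c / n) for v, c in ranked[:top]]
    rest = ranked[top:]
    if rest:
        other = sum(c for _, c in rest)
        rows.append((f"Other values ({len(rest)})", other, 100.0 * other / n))
    return rows


# Rebuild the column from the counts reported in this section.
payer_counts = {
    "Medicare": 345, "Insurance Company": 312, "Medicaid": 149,
    "Blue Cross": 135, "Self-Pay": 45, "Workers Compensation": 10,
    "Other Non-Federal Program": 3, "Other Federal Program": 1,
}
values = [payer for payer, c in payer_counts.items() for _ in range(c)]

rows = value_count_table(values, top=5)
for value, count, pct in rows:
    print(f"{value} | {count} | {pct:.1f}%")
```

With top=5 the last row aggregates Workers Compensation, Other Non-Federal Program, and Other Federal Program, which reproduces the 14 observations shown in the overview.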

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
medicare 345
23.5%
insurance 312
21.3%
company 312
21.3%
medicaid 149
10.2%
blue 135
 
9.2%
cross 135
 
9.2%
self-pay 45
 
3.1%
workers 10
 
0.7%
compensation 10
 
0.7%
other 4
 
0.3%
Other values (3) 8
 
0.5%

Most occurring characters

ValueCountFrequency (%)
e 1363
12.1%
a 1181
 
10.5%
n 959
 
8.5%
r 828
 
7.4%
c 806
 
7.2%
i 653
 
5.8%
d 647
 
5.7%
s 602
 
5.3%
M 494
 
4.4%
o 484
 
4.3%
Other values (21) 3245
28.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9236
82.0%
Uppercase Letter 1513
 
13.4%
Space Separator 465
 
4.1%
Dash Punctuation 48
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1363
14.8%
a 1181
12.8%
n 959
10.4%
r 828
9.0%
c 806
8.7%
i 653
7.1%
d 647
7.0%
s 602
6.5%
o 484
 
5.2%
u 447
 
4.8%
Other values (9) 1266
13.7%
Uppercase Letter
ValueCountFrequency (%)
M 494
32.7%
C 457
30.2%
I 312
20.6%
B 135
 
8.9%
P 49
 
3.2%
S 45
 
3.0%
W 10
 
0.7%
O 4
 
0.3%
F 4
 
0.3%
N 3
 
0.2%
Space Separator
ValueCountFrequency (%)
465
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 48
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10749
95.4%
Common 513
 
4.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1363
12.7%
a 1181
11.0%
n 959
 
8.9%
r 828
 
7.7%
c 806
 
7.5%
i 653
 
6.1%
d 647
 
6.0%
s 602
 
5.6%
M 494
 
4.6%
o 484
 
4.5%
Other values (19) 2732
25.4%
Common
ValueCountFrequency (%)
465
90.6%
- 48
 
9.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1363
12.1%
a 1181
 
10.5%
n 959
 
8.5%
r 828
 
7.4%
c 806
 
7.2%
i 653
 
5.8%
d 647
 
5.7%
s 602
 
5.3%
M 494
 
4.4%
o 484
 
4.3%
Other values (21) 3245
28.8%

Emergency Department Indicator
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB
True
573 
False
427 
Value | Count | Frequency (%)
True | 573 | 57.3%
False | 427 | 42.7%

Total Charges
Real number (ℝ)

HIGH CORRELATION 

Distinct: 992
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29793.46
Minimum: 838.41
Maximum: 793781.42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 838.41
5-th percentile: 2929.65
Q1: 8020.5
median: 16743.115
Q3: 32105.45
95-th percentile: 88238.435
Maximum: 793781.42
Range: 792943.01
Interquartile range (IQR): 24084.95

Descriptive statistics

Standard deviation: 53291.025
Coefficient of variation (CV): 1.788682
Kurtosis: 77.483408
Mean: 29793.46
Median Absolute Deviation (MAD): 9979.94
Skewness: 7.4575585
Sum: 29793460
Variance: 2.8399333 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7900 | 3 | 0.3%
6977.64 | 3 | 0.3%
8400 | 2 | 0.2%
31399.38 | 2 | 0.2%
11400 | 2 | 0.2%
1815 | 2 | 0.2%
4636.78 | 1 | 0.1%
15362.54 | 1 | 0.1%
16151.35 | 1 | 0.1%
75648.39 | 1 | 0.1%
Other values (982) | 982 | 98.2%
Value | Count | Frequency (%)
838.41 | 1 | 0.1%
1135.4 | 1 | 0.1%
1160.58 | 1 | 0.1%
1229 | 1 | 0.1%
1249.5 | 1 | 0.1%
1320 | 1 | 0.1%
1344.4 | 1 | 0.1%
1468 | 1 | 0.1%
1517.12 | 1 | 0.1%
1520 | 1 | 0.1%
Value | Count | Frequency (%)
793781.42 | 1 | 0.1%
610993.5 | 1 | 0.1%
557933.48 | 1 | 0.1%
465812.87 | 1 | 0.1%
456332.22 | 1 | 0.1%
403371.55 | 1 | 0.1%
272901.37 | 1 | 0.1%
252405.89 | 1 | 0.1%
248317.8 | 1 | 0.1%
246308.42 | 1 | 0.1%

Total Costs
Real number (ℝ)

HIGH CORRELATION 

Distinct: 992
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12062.908
Minimum: 218.92
Maximum: 390790.56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 218.92
5-th percentile: 1383.838
Q1: 3395.0825
median: 6255.42
Q3: 12700.9
95-th percentile: 36301.649
Maximum: 390790.56
Range: 390571.64
Interquartile range (IQR): 9305.8175

Descriptive statistics

Standard deviation: 23517.768
Coefficient of variation (CV): 1.9495935
Kurtosis: 116.83331
Mean: 12062.908
Median Absolute Deviation (MAD): 3690.295
Skewness: 9.1241533
Sum: 12062908
Variance: 5.5308541 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2450.55 | 3 | 0.3%
1895.47 | 3 | 0.3%
2605.65 | 2 | 0.2%
8529.63 | 2 | 0.2%
3536.23 | 2 | 0.2%
1695.56 | 2 | 0.2%
3501.6 | 1 | 0.1%
4193.11 | 1 | 0.1%
7006.04 | 1 | 0.1%
22662.4 | 1 | 0.1%
Other values (982) | 982 | 98.2%
Value | Count | Frequency (%)
218.92 | 1 | 0.1%
435.5 | 1 | 0.1%
471.5 | 1 | 0.1%
582.39 | 1 | 0.1%
629.34 | 1 | 0.1%
631 | 1 | 0.1%
695.31 | 1 | 0.1%
706.71 | 1 | 0.1%
732.93 | 1 | 0.1%
770.88 | 1 | 0.1%
Value | Count | Frequency (%)
390790.56 | 1 | 0.1%
336406.78 | 1 | 0.1%
226758.29 | 1 | 0.1%
204738.48 | 1 | 0.1%
159131.61 | 1 | 0.1%
130717.81 | 1 | 0.1%
109088.18 | 1 | 0.1%
102223.83 | 1 | 0.1%
98325.46 | 1 | 0.1%
94684.57 | 1 | 0.1%

Interactions

[Figures not preserved: 49 interaction plots]

Correlations

Variable | Operating Certificate Number | Facility ID | Length of Stay | APR DRG Code | APR MDC Code | Total Charges | Total Costs | Health Service Area | Age Group | Gender | Race | Ethnicity | Type of Admission | Patient Disposition | APR MDC Description | APR Severity of Illness Code | APR Severity of Illness Description | APR Risk of Mortality | APR Medical Surgical Description | Source of Payment 1 | Emergency Department Indicator
Operating Certificate Number | 1.000 | 0.924 | -0.087 | 0.060 | 0.061 | 0.080 | 0.068 | 0.666 | 0.067 | 0.024 | 0.253 | 0.167 | 0.086 | 0.026 | 0.074 | 0.098 | 0.098 | 0.089 | 0.076 | 0.072 | 0.000
Facility ID | 0.924 | 1.000 | -0.092 | 0.060 | 0.057 | 0.041 | 0.034 | 0.428 | 0.095 | 0.040 | 0.185 | 0.182 | 0.098 | 0.032 | 0.064 | 0.078 | 0.078 | 0.066 | 0.116 | 0.063 | 0.107
Length of Stay | -0.087 | -0.092 | 1.000 | -0.008 | 0.029 | 0.662 | 0.702 | 0.000 | 0.040 | 0.047 | 0.000 | 0.026 | 0.047 | 0.176 | 0.174 | 0.254 | 0.254 | 0.256 | 0.083 | 0.000 | 0.000
APR DRG Code | 0.060 | 0.060 | -0.008 | 1.000 | 0.962 | -0.229 | -0.178 | 0.000 | 0.476 | 0.282 | 0.081 | 0.053 | 0.508 | 0.177 | 0.932 | 0.228 | 0.228 | 0.235 | 0.432 | 0.183 | 0.442
APR MDC Code | 0.061 | 0.057 | 0.029 | 0.962 | 1.000 | -0.201 | -0.148 | 0.017 | 0.353 | 0.156 | 0.072 | 0.000 | 0.359 | 0.130 | 0.993 | 0.207 | 0.207 | 0.215 | 0.331 | 0.169 | 0.396
Total Charges | 0.080 | 0.041 | 0.662 | -0.229 | -0.201 | 1.000 | 0.888 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.089 | 0.281 | 0.147 | 0.249 | 0.249 | 0.234 | 0.225 | 0.000 | 0.068
Total Costs | 0.068 | 0.034 | 0.702 | -0.178 | -0.148 | 0.888 | 1.000 | 0.000 | 0.000 | 0.064 | 0.000 | 0.000 | 0.104 | 0.199 | 0.200 | 0.204 | 0.204 | 0.201 | 0.170 | 0.000 | 0.064
Health Service Area | 0.666 | 0.428 | 0.000 | 0.000 | 0.017 | 0.000 | 0.000 | 1.000 | 0.033 | 0.000 | 0.256 | 0.141 | 0.085 | 0.051 | 0.000 | 0.065 | 0.065 | 0.081 | 0.031 | 0.088 | 0.064
Age Group | 0.067 | 0.095 | 0.040 | 0.476 | 0.353 | 0.000 | 0.000 | 0.033 | 1.000 | 0.169 | 0.118 | 0.024 | 0.423 | 0.254 | 0.520 | 0.228 | 0.228 | 0.320 | 0.160 | 0.346 | 0.301
Gender | 0.024 | 0.040 | 0.047 | 0.282 | 0.156 | 0.000 | 0.064 | 0.000 | 0.169 | 1.000 | 0.000 | 0.000 | 0.111 | 0.078 | 0.323 | 0.042 | 0.042 | 0.005 | 0.000 | 0.099 | 0.000
Race | 0.253 | 0.185 | 0.000 | 0.081 | 0.072 | 0.000 | 0.000 | 0.256 | 0.118 | 0.000 | 1.000 | 0.346 | 0.042 | 0.080 | 0.080 | 0.067 | 0.067 | 0.078 | 0.090 | 0.165 | 0.000
Ethnicity | 0.167 | 0.182 | 0.026 | 0.053 | 0.000 | 0.000 | 0.000 | 0.141 | 0.024 | 0.000 | 0.346 | 1.000 | 0.060 | 0.000 | 0.049 | 0.042 | 0.042 | 0.068 | 0.078 | 0.103 | 0.066
Type of Admission | 0.086 | 0.098 | 0.047 | 0.508 | 0.359 | 0.089 | 0.104 | 0.085 | 0.423 | 0.111 | 0.042 | 0.060 | 1.000 | 0.090 | 0.551 | 0.175 | 0.175 | 0.170 | 0.433 | 0.138 | 0.778
Patient Disposition | 0.026 | 0.032 | 0.176 | 0.177 | 0.130 | 0.281 | 0.199 | 0.051 | 0.254 | 0.078 | 0.080 | 0.000 | 0.090 | 1.000 | 0.153 | 0.347 | 0.347 | 0.338 | 0.160 | 0.157 | 0.127
APR MDC Description | 0.074 | 0.064 | 0.174 | 0.932 | 0.993 | 0.147 | 0.200 | 0.000 | 0.520 | 0.323 | 0.080 | 0.049 | 0.551 | 0.153 | 1.000 | 0.279 | 0.279 | 0.279 | 0.462 | 0.190 | 0.470
APR Severity of Illness Code | 0.098 | 0.078 | 0.254 | 0.228 | 0.207 | 0.249 | 0.204 | 0.065 | 0.228 | 0.042 | 0.067 | 0.042 | 0.175 | 0.347 | 0.279 | 1.000 | 1.000 | 0.588 | 0.000 | 0.177 | 0.254
APR Severity of Illness Description | 0.098 | 0.078 | 0.254 | 0.228 | 0.207 | 0.249 | 0.204 | 0.065 | 0.228 | 0.042 | 0.067 | 0.042 | 0.175 | 0.347 | 0.279 | 1.000 | 1.000 | 0.588 | 0.000 | 0.177 | 0.254
APR Risk of Mortality | 0.089 | 0.066 | 0.256 | 0.235 | 0.215 | 0.234 | 0.201 | 0.081 | 0.320 | 0.005 | 0.078 | 0.068 | 0.170 | 0.338 | 0.279 | 0.588 | 0.588 | 1.000 | 0.000 | 0.265 | 0.245
APR Medical Surgical Description | 0.076 | 0.116 | 0.083 | 0.432 | 0.331 | 0.225 | 0.170 | 0.031 | 0.160 | 0.000 | 0.090 | 0.078 | 0.433 | 0.160 | 0.462 | 0.000 | 0.000 | 0.000 | 1.000 | 0.135 | 0.232
Source of Payment 1 | 0.072 | 0.063 | 0.000 | 0.183 | 0.169 | 0.000 | 0.000 | 0.088 | 0.346 | 0.099 | 0.165 | 0.103 | 0.138 | 0.157 | 0.190 | 0.177 | 0.177 | 0.265 | 0.135 | 1.000 | 0.189
Emergency Department Indicator | 0.000 | 0.107 | 0.000 | 0.442 | 0.396 | 0.068 | 0.064 | 0.064 | 0.301 | 0.000 | 0.000 | 0.066 | 0.778 | 0.127 | 0.470 | 0.254 | 0.254 | 0.245 | 0.232 | 0.189 | 1.000
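
The "high correlation" alerts can be related back to this matrix by listing the variable pairs whose coefficient exceeds a cutoff. The sketch below does this for a handful of coefficients copied from the matrix above. The cutoff of 0.5 is an assumption made for illustration, not a value stated in this report, and the coefficients are taken as given without regard to how each was computed.

```python
# Coefficients copied from the correlation matrix above (a small subset).
pairs = {
    ("Length of Stay", "Total Charges"): 0.662,
    ("Length of Stay", "Total Costs"): 0.702,
    ("Length of Stay", "APR DRG Code"): -0.008,
    ("Length of Stay", "APR MDC Code"): 0.029,
    ("Total Charges", "Total Costs"): 0.888,
    ("Total Charges", "APR DRG Code"): -0.229,
    ("Total Charges", "APR MDC Code"): -0.201,
    ("Total Costs", "APR DRG Code"): -0.178,
    ("Total Costs", "APR MDC Code"): -0.148,
    ("APR DRG Code", "APR MDC Code"): 0.962,
}

THRESHOLD = 0.5  # assumed cutoff for "high" correlation

# Keep pairs at or above the cutoff, strongest first.
high = sorted(
    ((a, b, r) for (a, b), r in pairs.items() if abs(r) >= THRESHOLD),
    key=lambda item: -abs(item[2]),
)

for a, b, r in high:
    print(f"{a} <-> {b}: {r:.3f}")
```

Under that assumed cutoff, the four pairs that remain are consistent with the alerts listed for Length of Stay, Total Charges, Total Costs, APR DRG Code, and APR MDC Code.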

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
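
Nullity correlation can be illustrated as the Pearson correlation between the missing-value indicators of two columns. The following is a minimal sketch under that reading; the function name nullity_correlation and the toy columns col_x, col_y, and col_z are hypothetical and do not come from this dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns.

    Returns NaN when either column is entirely present or entirely missing,
    because the indicator then has zero variance.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Toy columns: col_y is missing exactly where col_x is; col_z is the opposite.
col_x = [1, None, 3, None, 5, 6]
col_y = ["a", None, "c", None, "e", "f"]
col_z = [None, 2, None, 4, None, None]

same = nullity_correlation(col_x, col_y)      # missing together
opposite = nullity_correlation(col_x, col_z)  # one present when the other is missing
print(same, opposite)
```

A value near 1 means two variables tend to be missing in the same rows, and a value near -1 means one tends to be present exactly when the other is missing.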

Sample

Index | Health Service Area | Hospital County | Operating Certificate Number | Facility ID | Facility Name | Age Group | Gender | Race | Ethnicity | Length of Stay | Type of Admission | Patient Disposition | APR DRG Code | APR DRG Description | APR MDC Code | APR MDC Description | APR Severity of Illness Code | APR Severity of Illness Description | APR Risk of Mortality | APR Medical Surgical Description | Source of Payment 1 | Emergency Department Indicator | Total Charges | Total Costs
1805584 | New York City | Kings | 7001037.0 | 1320.0 | University Hospital of Brooklyn | 50 to 69 | F | Black/African American | Not Span/Hispanic | 8 | Emergency | Home or Self Care | 53 | SEIZURE | 1 | Diseases and Disorders of the Nervous System | 2 | Moderate | Moderate | Medical | Insurance Company | Y | 31404.00 | 28633.04
1921786 | New York City | Manhattan | 7002053.0 | 1446.0 | NYU Hospital for Joint Diseases | 30 to 49 | F | White | Not Span/Hispanic | 5 | Elective | Home w/ Home Health Services | 301 | HIP JOINT REPLACEMENT | 8 | Diseases and Disorders of the Musculoskeletal System and Conn Tissue | 2 | Moderate | Minor | Surgical | Insurance Company | N | 104074.81 | 29504.26
1933566 | New York City | Manhattan | 7002012.0 | 1447.0 | Hospital for Special Surgery | 50 to 69 | M | White | Not Span/Hispanic | 11 | Elective | Home or Self Care | 313 | KNEE & LOWER LEG PROCEDURES EXCEPT FOOT | 8 | Diseases and Disorders of the Musculoskeletal System and Conn Tissue | 2 | Moderate | Minor | Surgical | Other Non-Federal Program | N | 105382.07 | 42137.14
1135536 | Long Island | Suffolk | 5154001.0 | 925.0 | Good Samaritan Hospital Medical Center | 18 to 29 | M | White | Spanish/Hispanic | 1 | Emergency | Home or Self Care | 113 | INFECTIONS OF UPPER RESPIRATORY TRACT | 3 | Ear, Nose, Mouth, Throat and Craniofacial Diseases and Disorders | 2 | Moderate | Minor | Medical | Medicare | Y | 14537.00 | 2154.88
1593468 | New York City | Kings | 7001009.0 | 1294.0 | Coney Island Hospital | 50 to 69 | M | Black/African American | Not Span/Hispanic | 3 | Emergency | Home or Self Care | 241 | PEPTIC ULCER & GASTRITIS | 6 | Diseases and Disorders of the Digestive System | 3 | Major | Moderate | Medical | Medicaid | Y | 11610.51 | 8155.50
1029498 | Capital/Adiron | Schenectady | 4601001.0 | 848.0 | Ellis Hospital - Bellevue Woman's Care Center Division | 0 to 17 | F | White | Not Span/Hispanic | 2 | Newborn | Home or Self Care | 626 | NEONATE BWT 2000-2499G, NORMAL NEWBORN OR NEONATE W OTHER PROBLEM | 15 | Newborns and Other Neonates with Conditions Originating in the Perinatal Period | 2 | Moderate | Minor | Medical | Medicaid | N | 2911.14 | 1703.84
1054348 | Long Island | Suffolk | 5123000.0 | 885.0 | Brookhaven Memorial Hospital Medical Center Inc | 70 or Older | F | White | Unknown | 7 | Emergency | Skilled Nursing Home | 660 | MAJOR HEMATOLOGIC/IMMUNOLOGIC DIAG EXC SICKLE CELL CRISIS & COAGUL | 16 | Diseases and Disorders of Blood, Blood Forming Organs and Immunological Disorders | 3 | Major | Major | Medical | Blue Cross | N | 52203.22 | 11977.98
1054472 | Long Island | Suffolk | 5123000.0 | 885.0 | Brookhaven Memorial Hospital Medical Center Inc | 70 or Older | F | White | Unknown | 3 | Emergency | Skilled Nursing Home | 342 | FRACTURES & DISLOCATIONS EXCEPT FEMUR, PELVIS & BACK | 8 | Diseases and Disorders of the Musculoskeletal System and Conn Tissue | 2 | Moderate | Minor | Medical | Medicare | Y | 20398.37 | 3763.25
2240119 | New York City | Manhattan | 7002032.0 | 1469.0 | St Lukes Roosevelt Hospital - St Lukes Hospital Division | 70 or Older | F | Black/African American | Not Span/Hispanic | 1 | Urgent | Home or Self Care | 175 | PERCUTANEOUS CARDIOVASCULAR PROCEDURES W/O AMI | 5 | Diseases and Disorders of the Circulatory System | 2 | Moderate | Moderate | Surgical | Medicare | N | 36699.65 | 18386.37
1877534 | New York City | Manhattan | 7002002.0 | 1439.0 | Beth Israel Medical Center/Petrie Campus | 70 or Older | F | White | Not Span/Hispanic | 1 | Emergency | Home or Self Care | 422 | HYPOVOLEMIA & RELATED ELECTROLYTE DISORDERS | 10 | Endocrine, Nutritional and Metabolic Diseases and Disorders | 3 | Major | Moderate | Medical | Medicare | Y | 8867.15 | 2478.94
Index | Health Service Area | Hospital County | Operating Certificate Number | Facility ID | Facility Name | Age Group | Gender | Race | Ethnicity | Length of Stay | Type of Admission | Patient Disposition | APR DRG Code | APR DRG Description | APR MDC Code | APR MDC Description | APR Severity of Illness Code | APR Severity of Illness Description | APR Risk of Mortality | APR Medical Surgical Description | Source of Payment 1 | Emergency Department Indicator | Total Charges | Total Costs
856149 | Finger Lakes | Ontario | 3421000.0 | 676.0 | Clifton Springs Hospital and Clinic | 18 to 29 | M | White | Not Span/Hispanic | 1 | Emergency | Short-term Hospital | 720 | SEPTICEMIA & DISSEMINATED INFECTIONS | 18 | Infectious and Parasitic Diseases, Systemic or Unspecified Sites | 3 | Major | Major | Medical | Insurance Company | Y | 4217.94 | 1966.79
1759721 | New York City | Kings | 7001046.0 | 1309.0 | Interfaith Medical Center | 50 to 69 | M | Black/African American | Not Span/Hispanic | 8 | Emergency | Home or Self Care | 860 | REHABILITATION | 23 | Rehabilitation, Aftercare, Other Factors Influencing Health Status and Other Health Service Contacts | 1 | Minor | Minor | Medical | Medicaid | Y | 29167.12 | 7445.34
68660 | Southern Tier | Broome | 303001.0 | 42.0 | United Health Services Hospitals Inc. - Binghamton General Hospital | 50 to 69 | F | White | Not Span/Hispanic | 2 | Urgent | Home or Self Care | 115 | OTHER EAR, NOSE, MOUTH,THROAT & CRANIAL/FACIAL DIAGNOSES | 3 | Ear, Nose, Mouth, Throat and Craniofacial Diseases and Disorders | 2 | Moderate | Moderate | Medical | Blue Cross | N | 5658.70 | 2853.83
1708192 | New York City | Kings | 7001020.0 | 1305.0 | Maimonides Medical Center | 70 or Older | M | Other Race | Not Span/Hispanic | 2 | Urgent | Home or Self Care | 143 | OTHER RESPIRATORY DIAGNOSES EXCEPT SIGNS, SYMPTOMS & MINOR DIAGNOSES | 4 | Diseases and Disorders of the Respiratory System | 1 | Minor | Minor | Medical | Medicare | N | 22800.00 | 7072.47
2369422 | New York City | Queens | 7003006.0 | 1632.0 | Peninsula Hospital Center | 70 or Older | F | Black/African American | Not Span/Hispanic | 47 | Emergency | Inpatient Rehabilitation Facility | 720 | SEPTICEMIA & DISSEMINATED INFECTIONS | 18 | Infectious and Parasitic Diseases, Systemic or Unspecified Sites | 4 | Extreme | Extreme | Medical | Medicare | N | 196745.30 | 70501.68
325759 | Long Island | Suffolk | 5151001.0 | 245.0 | University Hospital | 30 to 49 | F | White | Not Span/Hispanic | 2 | Urgent | Home or Self Care | 560 | VAGINAL DELIVERY | 14 | Pregnancy, Childbirth and the Puerperium | 2 | Moderate | Minor | Medical | Blue Cross | N | 10437.38 | 4483.25
109687 | Western NY | Chautauqua | 601000.0 | 98.0 | Brooks Memorial Hospital | 30 to 49 | M | White | Not Span/Hispanic | 6 | Emergency | Home or Self Care | 244 | DIVERTICULITIS & DIVERTICULOSIS | 6 | Diseases and Disorders of the Digestive System | 2 | Moderate | Minor | Medical | Insurance Company | Y | 10437.71 | 5467.92
641539Long IslandNassau2951001.0541.0North Shore University Hospital70 or OlderFWhiteNot Span/Hispanic5EmergencySkilled Nursing Home308HIP & FEMUR PROCEDURES FOR TRAUMA EXCEPT JOINT REPLACEMENT8Diseases and Disorders of the Musculoskeletal System and Conn Tissue2ModerateModerateSurgicalMedicareY56847.7013424.24
2186357New York CityManhattan7002054.01464.0New York Presbyterian Hospital - Columbia Presbyterian Center50 to 69MWhiteNot Span/Hispanic1UrgentHome or Self Care175PERCUTANEOUS CARDIOVASCULAR PROCEDURES W/O AMI5Diseases and Disorders of the Circulatory System3MajorModerateSurgicalInsurance CompanyN52881.6615119.88
616604Long IslandNassau2950002.0528.0Nassau University Medical Center70 or OlderFWhiteNot Span/Hispanic2EmergencyHome or Self Care47TRANSIENT ISCHEMIA1Diseases and Disorders of the Nervous System2ModerateModerateMedicalBlue CrossY8366.687055.45